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Abstract 

Cremer-Pople puckering coordinates appear to be the natural candidate variables to explore the 
conformational space of cyclic compounds, and in literature different parametrizations have been 
used to this end. However, while every parametrization is equivalent in identifying conformations, 
it is not obvious that they can also act as proper collective variables for the exploration of the 
puckered conformations free energy surface. It is shown that only the polar parametrization is 
fit to produce an unbiased estimate of the free energy landscape. As an example, the case of a 
six-membered ring, glucuronic acid, is presented, showing the artefacts that are generated when a 
wrong parametrization is used. 
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I. INTRODUCTION 



Cyclic and heterocyclic compounds plays a particularly relevant role in many chemical 
and biological processes. Carbohydrates, for example, are one of the fundamental building 
blocks of the biochemical structure and activity of the cell, including energy transport, 
cell recognition and signaling. Knowing the structural and conformational properties of 
cyclic compounds is therefore a task of primary interest. The quantitative description of 
puckered conformation of A^-membered rings is of fundamental importance for the physics 
and chemistry of cyclic compounds. 

After early attempts to rigorously describe puckered forms, starting from the work by 
Kilpatrikfl, the problem of a general definition of puckering coordinates was eventually 
settled by Cremer and Poplej^. A ring conformation is uniquely identified by just — 3 
parameters, and is actually representative of an infinite set of points in the (3A^ — 6)- 
dimensional space of configurations. Although puckering coordinates apply to rings of an 
arbitrary number of members, their interpretation is already not straightforward for = 
7 a. aud soon becomes a daunting task for larger N flfllchoiees other than the Cremer- 
Pople one are possible [61], although completely equivalent [T]]). If one restricts the analysis 
to six-membered rings, the most used parametrization of puckering coordinates is probably 
represented by the Cremer-Pople polar coordinates set {Q,6,(f)), that spans the configuration 
space using a radial coordinate Q (the total puckering amplitude), and the zenithal and 
azimuthal angles 6 and 0, respectively. Amongthe other possible parametrizations, the 
Cartesian one undoubtedly has some advantages 2, l8|. 

Cremer-Pople puckering coordinates are surely the correct tool to map the conforma- 
tional space of cyclic structures, but in order to understand the properties of these systems 
one needs to know the associated free energy landscape. Since the free energy differences 
between conformers (and the barriers in between) are usually rather large, standard compu- 
tational approaches like Molecular Dynamics are ineffective. Methods exist, such as umbrella 
sampling[9|, [lO| or met adynamics 111. Il2l |. that allow accelerated sampling of different con- 
formations by adding bias forces. In general, these accelerated sampling methods are based 
on the choice of a (usually small) number of collective variables (CVs). The choice of CVs 
is of particular importance for the proper reconstruction of free energy landscapes: their 
number should be small in order to speed up the sampling consistently, and they should 
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represent every slowly evolving degree of freedom of the system. Otherwise, the estimate of 



the profile could be severely biased 



13|. In this work we show that only the Cremer-Pople 



polar coordinates are suitable for use as CVs to perform accelerated sampling, while other 
parametrizations, such as the Cartesian one, introduce strong biases in the reconstruction 
of the free energy profile. As a practical example, the puckering free energy profile of the 
glucuronic acid, described using a classical force-field is calculated using two different set of 
CVs (polar and Cartesian) . The problems arising from the use of Cartesian coordinates are 
analyzed and discussed. 



II. PUCKERING COORDINATES AS COLLECTIVE VARIABLES 



Given the value of the atomic distances Zj from the mean plane of the ring, the — 3 
independent puckering coordinates are obtained as follows (for the complete derivation see 
the Appendix and Ref.j^): 




where the index m runs from 2 to (A^ — 1)/2 and to A^/2 — 1 for odd and even A^, respectively. 
The normalization factors in Eqs.([T]l3]) are such that the total puckering amplitude Q can be 
defined as 

N 

Q'^Y.'' = Y.^i (4) 

for both the even and odd cases. 

For a six-membered ring there is only one amplitude-phase pair {q2, (j)2) and a single 
puckering coordinate gs. The polar coordinate set {Q, 9, 0) and the Cartesian one {q^, Qy, Qz) 
are related to the original (g2, 02, Qs) coordinate set as 

q2 cos 02 = Q sin 6 cos = 
\ q2 sin (p2 = Q sin ^ sin = qy 
^ q^ = Qcos9 = q^. 
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It is worth noting, that while Cremers-Pople coordinates (or equivalent ones) are the only 
rigorous coordinates to map the conformation space of puckered rings, often a set of dihedral 
angles has been employed instead(see for example [l^). To our knowledge, Cremers-Pople 
coordinates have been employed as CVs to sample the puckering free energy landscape only 
in a recent work by Biarnes and coworkers [l^, where the authors employed the Cartesian 
parametrization, though with an immaterial opposite sign for the phase 0. 

Let us now turn to the specific problem of using Cremer-Pople coordinates as CVs for 

metadynamics. Cyclic compounds are usually characterized by the presence of a number of 

metastable conformations, which are generally separated by high free energy barriers, and a 

proper sampling of the conformational free energy landscape can be achieved only using some 

accelerated methods. A modern and efficient class of adaptive algorithms is represented by 

j 

metadynamics, in its various flavors[13]. The trait common to every metadynamics algorithm 
is the presence of a history-dependent potential that drives the system out of the regions 
already visited. In order for the method to be efficient, the biasing potential has to be a 
function of a small number of CVs, that should also be chosen so that the bias forces allow 
the system to reach every point in the CV space. This ergodicity requirement is generally 
presented by saying that every slow degree of freedom has to be represented by a CV. 

In principle, for a six-membered ring structure, 3 parameters are sufficient to map the 
interesting regions of the conformation space. However, to obtain a reasonable description 
of a system with rigid or nearly-rigid bonds, only two coordinates are needed. This can be 
understood by noticing that the total puckering amplitude Q gives a measure of the quadratic 
displacement (EqJlj) from the middle plane. The value of this displacement is thus bound 
by the finite extension of the chemical bonds, and the density of states is therefore peaked 
in a thin spherical shell of radius Q. In general, only one minimum is present in the profile 
of Q, which usually has a unimodal distribution, and is definitely a fast degree of freedom. 
One can thus safely avoid using Q as a CV, retaining only the two remaining orthogonal 
coordinates, which, in the polar representation, are parametrized by the angles 9 and (j). 
The choice of the pair (6',0) as CVs does not create any ergodicity problem, because (a) Q 
is a fast degree of freedom and (b) the bias forces are always tangent to the sphere surface, 
thus allowing the metadynamics to reach every point of the conformation space. 

Conversely, if one wants to directly use the Cartesian coordinates as CVs, a number of 
problems arise. The bias forces associated with and Qy always point in a direction parallel 
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to the equatorial plane. This has two simple but important consequences. 

Firstly, if one only uses two CVs, namely, qx and g^, once the system is driven to the 
equatorial line, the bias forces are no longer able to force it to move along the zenithal 
direction. Any transition across the equatorial line has to take place only due to real forces 
and thermal fluctuation. If the free energy landscape presents a barrier higher than the 
thermal energy, it will be virtually impossible for the system to transit the equatorial line, 
thus rendering the method not ergodic at the practical level. This is why, for example, in a 
recent investigation of the puckerin g fr ee energy landscape of a /3-glucopyranoside performed 



using Car-Parrinello met adynamics [ 1 51] , Biarnes and coworkers found that, during their sim- 
ulation run, the system never explored the southern hemisphere. Moreover, the fact that 
the strength of the bias forces along the 9 direction decreases as cos(^) means that the depth 
of free energy wells and height of free energy barriers along the radial direction in the g^, % 
plane will be systematically overestimated. The error becomes more severe as the system 
approaches the equatorial line. The steep free energy barrier at high values of + q^ which 
has been observed in [l^ and which the metadynamics was unable to surmount is precisely 
an artefact generated by this biased sampling. 

Secondly, since the bias forces are not only softened along the 9 direction, but are also 
increased along the radial one, they will start to force the system to explore regions with 
values of Q far from the equilibrium ones, as soon as the system departs from the polar 
regions (even if the third CV, gz, is also employed). The reconstruction of the free energy 
profile will therefore be unavoidably biased, by the sampling of unwanted conformations at 
unphysical values of the total puckering amplitude. 

We wish to stress that all these problems are not related to puckering coordinates in 
general, but are due to the fact that systems of physical interest usually present a density 
of states which is concentrated in a thin spherical shell. In order to be able to sample the 
entire configuration space, bias forces have to be tangent to the sphere surface. Therefore, 
only linear combinations of the unit vectors 9 and are suitable to define the CV space. 

III. SIMULATION METHODS AND RESULTS 

As a practical example, we investigated the puckering free energy profile of glucuronic 
acid, employing both {qx,qy) and (6',0) as CVs. The molecule was modeled using the clas- 
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sical, united-atoms, G45A4[l6| forcefield. The system was coupled to a thermal bath by 
integrating the Langevin equations of motions with a timestep of 1 fs and a friction coeffi- 
cient of 0.1 ps^^. We employed the well-tempered variant of the metadynamics algorithm 
as implemented in the grometa simulation package [l8, Q, 3 • The grometa code 
was modified to implement the Cartesian and polar puckering coordinates. 

All simulations have been performed at a temperature of 300 K. A cut-off radius of 1 nm 
has been applied for every nonbonded interaction. Gaussians were placed in the CV space 
every 200 integration steps, using a starting height of 1.0 kJ/mol and a temperature window 
AT = 2000K for the well-tempered algorithm. In addition, the width of the Gaussians has 
been adapted runtime by rescaling it every 1000 steps to a factor 0.2 of the root mean square 
distance of the associated CV, evaluated in the last 1000-step window. During every run, 
independently of which CVs were used, the values of Q, 9 and (p were collected, in order 
to be able to investigate in parallel every possible conformational parameter. The starting 
conformation was the ^Ci chair (located approximately at = 0) for all simulations but one, 
in which the ergodicity was tested by starting the simulations from the ^C4 inverted chair 
(located at approximately 6 = ir). All energies reported in the following are referred to the 
global minimum. 

The first point we addressed regarded the validity of the assumption that Q can be 
excluded from the set of CVs needed to sample the puckered conformations free energy 
landscape. In order to do this, we performed metadynamics using all three polar CVs, and 
averaged out the two remaining degrees of freedom, thus obtaining an integral profile for Q. 
The estimate for P{Q), the probability distribution of Q, obtained by the exponentiation of 
the free energy profile, is reported in FiglH Indeed, P{Q) is characterized by a pronounced 
and narrow peak located around the value of 0.05 and is unimodal, thus satisfying the 
requirements needed to be considered a fast degree of freedom, and to justify the assumption 
of a quasi-two-dimensional, spherical conformational space. In the following analysis, we 
restricted the metadynamics to sample only the two dimensional spaces {qx,%) and (6',0). 

The puckering free energy profile calculated using the 9 and CVs is presented in Figj2l 
The full CV space has been spanned by the metadynamics run, and a deep rift can be 
observed at 6' ~ 0, which contains the global minimum corresponding to the ^Ci conforma- 
tion around = and some slightly distorted chairs around = 7r/2 and 37r/2. The next 
conformer which can be observed is a °''^B close to the border of the diagram, at 7r/2 
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FIG. 1: Probability density for the radial coordinate Q. 

and ~ 0. In the middle of the diagram (^^ ~ 7r/2, ~ tt) a local minimum corresponding 
to the Bo,3 conformer is present. A steep barrier then has to be overome, right after the 
equatorial line, to reach the conformation, the inverted chair, located at around 6 n. 

By employing the polar coordinates, the exploration of the southern hemisphere was not 
a problem, and the barrier which has been found located close to the equatorial line is 
definitely high 40 kJ/mol), but easily surmountable with the metadynamics technique. 
On the contrary, the problem of ergodicity is evident if one employs Cartesian coordinates 
to perform the metadynamics exploration. Since the Qx and qy coordinates alone can not 
distinguish between northern and southern hemisphere, we tracked the value of cos(^), and 
were thus able to compute (see FigjS]) the free energy landscape as well as the distribution 
of the Gaussians placed in either of the two hemispheres. It should be noted that the 
free energy G is a logarithmic measure of the number of conformations s visited during the 
metadynamics run: G{s) ~ log [/ Ss,s{t')(^t'] ■ Therefore, the plots obtained by separating out 
the contributions G'^^s) ~ log [J 5s^s{t')H{±cos{9))dt'~\ of the conformations in the northern 
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FIG. 2: Puckering free energy profile calculated using polar coordinates {0,(j)). Every isoline 
corresponds to an increment in energy of 5 kJ/mol. (Darker colors correspond to lower energies). 

and southern hemisphere, respectively, are only indicative of the sampled regions (here 
H denotes the Heavside function). The two contributions are not to be considered 
as free energies and, in particular, they do not add up to the real free energy landscape: 
G{s) 7^ G^{s) + G^{s). Nevertheless, the contributions from the two hemispheres are 
quite informative, and the first thing one notices from their graphs (see Fig. [3], middle 
and bottom panels) is that only the northern hemisphere has been explored completely 
during the metadynamics run. This result qualitatively reproduces the behavior observed 
by Biarnes and coworkers [isl. who noticed a sampling of the northern hemisphere only, and 
attributed to the presence of an extremely high free energy barrier at the equatorial line. 

During our metadynamics run, the southern hemisphere was reached thanks to natural 
fluctuations, but the sampled region is relatively small, and the metadynamics algorithm was 
definitely not able to be ergodic, even though a tenfold longer simulation time and the same 
Gaussian heights have been used with respect to the metadynamics with polar coordinates. 



8 



^Ci ^C4 Q'=^B Bo,3 
AGt 0.0 27.7 9.3 20.0 
(9,(1)) (0.2,6.1) (2.9,6.1) (1.4,0.0) (1.5,3.4) 

AG^ 0.0 n.a. 28.7 33.1 
iqx,qy) (0.0,0.0) n.a. (0.5,5.9) (-0.5,-6.4) 

TABLE I: The free energy (AG, in kJ/mol) of different conformations, estimated using the two 
different sets of Cartesian^ and polar^ coordinates, along with the location of the conformations on 
the (6',(/)) plane (in rad) and (qx^Qy) plane (in 10~^nm), respectively. 

For comparison, the estimate of the barrier height at the equatorial hue obtained using 
Cartesian coordinate is more than 90 kJ/mol, compared with the value of about 40 kJ/mol 
estimated using polar coordinates. The presence of this artefact, as discussed in Section UTt 
prevents the system from exploring the southern hemisphere properly and the metadynamics 
from being ergodic. The magnitude of the bias which is introduced by using Cartesian 
coordinates can be appreciated by looking at the differences in energy of some selected 
conformations with that of the ^C4 chair, presented in TabUl 

The extent of the artefacts introduced by the use of Cartesian CVs can also be appreciated 
by looking at the effect that the mixing of tangent and radial coorinates has on the range of 
sampled conformations at different Q. In FigJUthe correlation between Q and the magnitude 
of the projected puckering vector = a/ + Qy is presented. At low values of qr, that is, 
at low values of the zenithal angle 6, the correlation with Q is minimal, and the radial 
coordinate is distributed around the value of 0.055 as in the free case. At values of 
larger than 0.05, on the other hand, a strong correlation develops, which almost completely 
correlates the two variables along the qr/Q = sin(^) = 1 line. This means that when the 
system is driven close to the equatorial line by the metadynamics, conformations that are 
sampled at higher values of 6 are necessarily characterized by an unphysically high total 
puckering amplitude, strongly biasing the reconstructed free energy profile. 



9 



-0.08 -0.04 0.04 0.08 
(nm) 



FIG. 3: Top panel: free energy profile generated using the qx,qy pair of CVs. Middle and bottom 
panels: logarithm of the distribution of the Gaussians placed in the northern (G^) and southern 
(G^) hemispheres, respectively, during the metadynamics run. Every isoline corresponds to an 
increment in energy of 5 kJ/mol. Note that the figures in the middle and bottom panels do not 
have not the meaning of a free energy; in these cases isolines have to be understood only as a guide 
to the eye. 
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FIG. 4: Bivariate distribution P{qr,Q) of the projection qr = ■\Jqx ~^ Qy = Qsm{6) versus the 
total puckering amphtude Q during a metadynamics run using qx £ind qy as CVs. A strong cor- 
relation develops as the qr/Q = sin[9) = 1 line is approached (Lighter colors correspond to lower 
probabilities) . 

IV. CONCLUSIONS 

We have critically investigated the use of two different sets of CVs which can be used 
for the study of puckered conformations of cyclic compounds. In this work we have shown 
that the Cartesian and polar sets of coordinates are not equivalent, and that only the polar 
one is a proper set of CVs that allows the ergodic and unbiased sampling of the space of 
different puckered conformations for six-membered rings. We supported our conclusions by 
performing a metadynamics calculation of the puckering free energy surface of a glucuronic 
acid ring, described by a simple atomistic model in vacuo using the GROMOS G45A4 force- 
field. We showed how the metadynamics performed using Cartesian CVs - in contrast to 
the polar case - is non-ergodic, and how the reconstructed free energy profile suffers from 
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strong biases. The conformations sampled close to the equatorial line had unphysical values 
of the total puckering amplitude, which has been shown to be markedly correlated with the 
proximity to the equatorial line. 

In conclusion, the non-ergodicity and the biases are too high a price to be paid for the 
simplifications deriving from the use of Cartesian coordinates to represent puckered confor- 
mations. To the best of our knowledge, only the Cartesian variant has been employed so 



far 



15| as a CV set for an accelerated sampling method, namely, metadynamics. In contrast, 



our analysis suggests that only polar coordinates should be considered as a proper set of 
CVs when studying the puckered conformations of ring structures via enhanced sampling 
techniques such as metadynamics. 
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Appendix: Gradients of the puckering coordinates 

For a metadynamics run it is necessary to compute gradients of the CVs, in order to 
include the bias contribution to the force acting on the system. To perform this task in this 
case it is necessary to recall some details in puckering coordinates construction. 

The starting point are the position vectors (here the index j runs from 1 to an arbitrary 
A^) of the atoms of the ring, with respect to an arbitrary reference frame in which we will 
compute derivatives. From these vectors it is possible to calculate the geometrical center 
of the ring 11=^ Ylj use this point as the origin of a new reference frame for the 

system, in which the atomic coordinate will be (summation on repeated indices is implied) 

R,- = r, - R = r^A,, , A,, = 5^,- - l/AT (5) 

{i = 1, . . . , N, and 6ij is the Kronecker symbol). In order to build all puckering coordinates 
the projections zj = Rj ■ fi onto the mean plane axis h are required. The mean plane axis 
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can be univocally defined as 



a 



n 



R' X R" 
|R' X R"| 



R' = J^j ^jWij 



(6) 



where the symbols 



Wm,j = sm ■ 



27rm(j — 1) 



are defined here for further use. 

The general puckering coordinate can now be rewritten as an explicit function of the 
atomic projections as 



27rm(j — 1) 



(7) 



qm = ^/2/N^/^^+M 

(prn = arctan [ - s/rn/ ^m] 



(8) 



making use of the definitions 



(9) 



to simplify the notation. All puckering coordinates are eventually a function of the projec- 
tions Zj, whose gradient is 



Rj ■ (R^ X R") 
|R' X R"| 



-ZjVi |R' X R'f ^ Vi [Rj ■ (R' X R")] 



(10) 



2|R' X R"|2 ' |R' X R"| 
Noting that Vj(Rj ■ e^) = AyCa (with an element of the vector basis) after a some- 
what long but straightforward calculation the terms of the right member in Eq.l fTOl) can be 
evaluated to be 

|2 



Vi |R' X R'T = 2AikW,,k [R'IR'f - R"(R" ■ R' 
+2A,fet;i,fc[R"|Rf -R'(R"-R' 



and 



Vi [Rj ■ (R' X R")] = AijR' X R" + AikWi^kR" x Rj+ 

+AikVi^kRj X R'. 
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